Array Restructuring for Cache Locality

Authors

  • Shun-Tak A. Leung
  • Albert Leung
  • John Zahorjan
Abstract

Array Restructuring for Cache Locality, by Shun-Tak Albert Leung. Chairperson of Supervisory Committee: Professor John Zahorjan, Department of Computer Science and Engineering.

Caches are used in almost every modern processor design to reduce the long memory access latency, which is increasingly a bottleneck to program performance. For caches to be effective, programs must exhibit good data locality. Thus, an optimizing compiler may have to restructure programs to enhance their locality. We focus on the class of restructuring techniques that target array accesses in loops. There are two approaches to enhancing the locality of such accesses: loop restructuring and array restructuring. Under loop restructuring, a compiler adopts a canonical array layout but transforms the order in which loop iterations are performed and thereby reorders the execution of array accesses. Under array restructuring, in contrast, a compiler lays out array elements in an order that matches the access pattern, while preserving the flow of control. While loop restructuring has been studied extensively, array restructuring has received much less attention despite advantages such as its applicability to complicated loop structures that may hamper loop restructuring. To fill the void, this dissertation investigates how to perform array restructuring effectively: efficiently, automatically, and generally. We present a formal framework for array transformations that meet these objectives. Such transformations are represented by linear transformations of array index vectors. Within this framework, we develop algorithms to solve various problems in array restructuring: selecting transformations based on the access pattern, laying out elements of restructured arrays, and determining which elements are accessed by a loop and thus restructuring only that part of an array.
To evaluate our array restructuring technique, we implemented a prototype compiler and performed a series of experiments with loops commonly used in related loop restructuring studies. Experimental measurements showed that array restructuring improved performance substantially in many cases, despite a modest runtime overhead in some. Moreover, the results also indicated that array restructuring complemented loop restructuring in applicability and performance: it applied where loop restructuring did not; when both applied, it offered comparable, sometimes even better, performance; in cases where it did not perform as well, loop restructuring improved performance considerably anyway. This observation points to the potential benefit of integrating the two complementary approaches.
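The idea of representing a layout change as a linear transformation of array index vectors can be illustrated with a small sketch. This is only an illustration under stated assumptions, not the dissertation's algorithm: the function names (`row_major_offset`, `apply_transform`, `inner_loop_strides`), the 4x4 array size, and the choice of an index-swapping matrix are all hypothetical. The sketch takes a loop nest whose inner loop reads A[j][i] from a row-major array and compares the distance between consecutive accesses before and after the index vector is multiplied by a transformation matrix.

```python
# Hypothetical illustration: all names and the matrix T are assumptions
# made for this sketch; they do not come from the dissertation.

def row_major_offset(index, shape):
    """Linear offset of a multi-dimensional index in a row-major layout."""
    offset = 0
    for i, extent in zip(index, shape):
        offset = offset * extent + i
    return offset

def apply_transform(T, index):
    """Multiply the integer matrix T by an index vector."""
    return tuple(sum(t * x for t, x in zip(row, index)) for row in T)

def inner_loop_strides(access, T, shape, n):
    """Offset differences between consecutive inner-loop iterations.

    The outer index i is fixed at 0 and the inner index j runs over 0..n-1.
    """
    offsets = [
        row_major_offset(apply_transform(T, access(0, j)), shape)
        for j in range(n)
    ]
    return [b - a for a, b in zip(offsets, offsets[1:])]

N = 4

def access(i, j):
    # The loop body reads A[j][i]; the inner loop iterates over j.
    return (j, i)

identity = [[1, 0], [0, 1]]  # original layout, index vector unchanged
swap = [[0, 1], [1, 0]]      # restructured layout, the two indices exchanged

original = inner_loop_strides(access, identity, (N, N), N)
restructured = inner_loop_strides(access, swap, (N, N), N)

print(original)      # stride of N elements between consecutive accesses
print(restructured)  # stride of 1 element between consecutive accesses
```

In this sketch the loop order is untouched; only the mapping from index vectors to storage changes, which is the distinction the abstract draws between array restructuring and loop restructuring.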


Similar resources

A Quantitative Algorithm for Data Locality Optimization

In this paper, we consider the problem of optimizing register allocation and cache behavior for loop array references. We exploit techniques developed initially for data locality estimation and improvement in the framework of cache or local memories. First we review the concept of "reference window" that serves as our basic tool for both data locality evaluation and management. Then we study ho...


Fast indexing for blocked array layouts to reduce cache misses

The increasing disparity between memory latency and processor speed is a critical bottleneck in achieving high performance. Recently, several studies have been conducted on blocked data layouts, in conjunction with loop tiling to improve locality of references. In this paper, we further reduce cache misses, restructuring the memory layout of multi-dimensional arrays, so that array elements are ...


A Cost Model For Integrated Restructuring Optimizations

Compilers must make choices between different optimizations; in this paper we present an analytic cost model that compares several compile-time optimizations for memory-intensive, matrix-based codes. These optimizations increase the spatial locality of references to improve cache hierarchy performance. Specifically, we consider loop transformations, array restructuring, and address remapping, a...


Non-linear memory layout transformations and data prefetching techniques to exploit locality of references for modern microprocessor architectures with multilayered memory hierarchies PHD THESIS

One of the key challenges computer architects and compiler writers are facing is the increasing discrepancy between processor cycle times and main memory access times. To overcome this problem, program transformations that decrease cache misses are used to reduce average latency for memory accesses. Tiling is a widely used loop iteration reordering technique for improving locality of referenc...


Optimizing Data Locality by Array Restructuring

It is increasingly important that optimizing compilers restructure programs for data locality to obtain high performance on today's powerful architectures. In this paper, we focus on array restructuring, a technique that improves the spatial locality exhibited by array accesses in nested loops. Specifically, we address the following question: Given a set of such accesses, how should the array e...




Publication date: 1996